🤗Accelerate

Topic	Replies	Views	Activity
About the 🤗Accelerate category	1	2413	February 20, 2022
Synchronizing State, Trainer and Accelerate	3	14	May 22, 2025
[RuntimeError] DPOTrainer - "element 0 of tensors does not require grad and does not have a grad_fn" on 8x A100 GPUs	1	21	May 20, 2025
Reproduce SFTTrainer with Accelerate and Pytorch	0	26	May 18, 2025
11B model gets OOM after using deepspeed zero 3 setting with 8 32G V100	2	1206	April 26, 2025
Multi-gpu inference llama-3.2 vision with QLoRA	4	81	April 25, 2025
How to work with meta tensors?	1	2081	April 16, 2025
BitsandBytes conflict with Accelerate	6	324	April 14, 2025
Issues with Dataset Loading and Checkpoint Saving using FSDP with HuggingFace Trainer on SLURM Multi-Node Setup	1	68	April 7, 2025
Meta device error while instantiating model	5	6834	April 1, 2025
Saving bf16 Model Weights When Using Accelerate+DeepSpeed	4	335	March 17, 2025
Cannot run multi GPU training on SLURM	1	79	March 16, 2025
Fp8 error in accelerate test	1	92	March 11, 2025
Accelerator .prepare() replaces custom DataLoader Sampler	5	1241	March 9, 2025
Using large dataset with accelerate	0	39	March 6, 2025
Accelerator.save_state errors out due to timeout. Unable to increase timeout through kwargs_handlers	5	1249	March 3, 2025
HF accelerate DeepSpeed plugin does not use custom optimizer or scheduler	2	19	March 1, 2025
Bug on multi-gpu trainer with accelerate	6	339	February 18, 2025
Accelerate remain stuck on using GPU 5 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular devic	1	793	February 17, 2025
Errors when using gradient accumulation with FSDP + PEFT LoRA + SFTTrainer	2	961	February 6, 2025
Save accelerate model	4	539	February 5, 2025
Calling other large models at runtime?	0	7	February 3, 2025
Training using FSDP, qLoRa on multinode	0	56	January 29, 2025
Are helper methods also in parallel?	0	10	January 27, 2025
Using device_map='auto' for training	5	34842	January 24, 2025
ValueError: The model has been loaded with `accelerate` and therefore cannot be moved to a specific device. Please discard the `device` argument when creating your pipeline object	5	201	January 20, 2025
Problems with hanging process at the end when using dataloaders on each process	5	4430	January 1, 2025
The used dataset had no length, returning gathered tensors. You should drop the remainder yourself	4	234	December 26, 2024
Grad Accumulation in FSDP	1	38	December 26, 2024
AttributeError: 'AcceleratorState' object has no attribute 'distributed_type', Llama 2 70B Fine-tuning, using 'accelerate' on a single GPU	1	1017	December 25, 2024